Latent topic model for audio retrieval
Abstract
Latent topic models such as Latent Dirichlet Allocation (LDA) were designed for text processing and have also proved successful in audio processing tasks. LDA assumes that the words of each document arise from a mixture of topics, each of which is a multinomial distribution over the vocabulary. Applying the original LDA to continuous data requires first generating word-like units by vector quantization (VQ), and this discretization usually results in information loss. To overcome this shortcoming, this paper introduces a new topic model named GaussianLDA for audio retrieval. The proposed model uses a continuous emission probability, a Gaussian instead of a multinomial distribution. It skips vector quantization and directly models each topic as a Gaussian distribution over audio features, which avoids discretization and integrates the clustering procedure into the model. Audio retrieval experiments demonstrate that GaussianLDA achieves better performance than the other compared methods. © 2013 Elsevier Ltd. All rights reserved.
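The generative story implied by the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the Dirichlet prior on clip-level topic proportions is carried over from standard LDA, and the diagonal covariance, the parameter values, and the names `sample_dirichlet` and `generate_clip` are assumptions made here for illustration only.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalising Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_clip(n_frames, alpha, means, stds, rng):
    """Generate one audio clip as a sequence of continuous feature vectors.

    means[k] and stds[k] hold the per-dimension mean and standard deviation
    of topic k (diagonal covariance is an assumption made for brevity).
    """
    # Clip-level topic proportions, as in standard LDA.
    theta = sample_dirichlet(alpha, rng)
    topics, frames = [], []
    for _ in range(n_frames):
        # Pick the topic responsible for this frame.
        z = rng.choices(range(len(alpha)), weights=theta)[0]
        # Continuous emission: a Gaussian draw replaces the multinomial
        # draw over a VQ codebook used by the original LDA.
        x = [rng.gauss(m, s) for m, s in zip(means[z], stds[z])]
        topics.append(z)
        frames.append(x)
    return theta, topics, frames


# Hypothetical toy setup: 3 topics over 2-dimensional features.
rng = random.Random(0)
alpha = [0.5, 0.5, 0.5]
means = [[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]]
stds = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
theta, topics, frames = generate_clip(20, alpha, means, stds, rng)
```

Because each frame is emitted directly from a Gaussian, no codebook has to be built beforehand; the topic means and variances take over the role that VQ cluster centres play in the discretized pipeline.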
Related resources
Supervised acoustic topic model for unstructured audio information retrieval
We introduce a modified version of the acoustic topic model, which assumes an audio signal consists of latent acoustic topics and each topic can be interpreted as a distribution over acoustic words, for unstructured audio information retrieval applications. The proposed supervised acoustic topic model is based on supervised latent Dirichlet allocation (sLDA) while the conventional acoustic topi...
Automatic Audio Tagging and Retrieval Using Semi-Supervised Canonical Density Estimation
We apply SSCDE (semi-supervised canonical density estimation), a semi-supervised learning method based on topic modeling, to audio tagging and retrieval problems. SSCDE was originally proposed as an image annotation and retrieval method, but it can also be applied to audio data. The SSCDE method consists of two parts: 1) extraction of a low-dimensional latent space representing topics of sounds ...
Using Naïve Text Queries for Robust Audio Information Retrieval
The goal of this work is to build an audio information retrieval system which provides users with flexibility in formulating their queries: from audio examples to naïve text. Specifically, the focus of this paper is on using naïve text to create input queries describing the desired information of the users. Using naïve text queries, however, raises interoperability issues between annotation and...
Study of entity-topic models for OOV proper name retrieval
Retrieving Proper Names (PNs) relevant to an audio document can improve speech recognition and content based audio-video indexing. Latent Dirichlet Allocation (LDA) topic model has been used to retrieve Out-Of-Vocabulary (OOV) PNs relevant to an audio document with good recall rates. However, retrieval of OOV PNs using LDA is affected by two issues, which we study in this paper: (1) Word Freque...
Journal title:
- Pattern Recognition
Volume 47
Publication date 2014